Brill Tagging using the Micron Automata Processor
Authors
Abstract
Brill tagging is a classic rule-based algorithm for part-of-speech tagging in Natural Language Processing. However, implementations of the tagger are inherently slow on conventional von Neumann architectures. In this paper, we accelerate the second stage of Brill tagging on the Micron Automata Processor (AP), a new computing architecture that can perform massive pattern matching in parallel. The design is tested on a subset of the Brown Corpus using 218 contextual rules. The results show a 38x speed-up for the second-stage tagger implemented on a single AP chip, compared with a single-threaded implementation on a CPU. This paper introduces the use of this new accelerator for computational linguistic tasks, particularly those that involve rule-based or pattern-matching approaches.
Keywords: part-of-speech tagging; Brill tagging; Automata Processor; Natural Language Processing
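To make the second stage concrete, the following is a minimal Python sketch of how contextual rules rewrite an initial tag sequence. It is an illustration only, not the paper's Automata Processor design or its 218 rules: the `Rule` type, the `apply_rules` function, and the restriction to two rule templates (previous tag, next tag) are assumptions made for this example, and rules are applied left to right with immediate effect.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    """Hypothetical contextual rule: change from_tag to to_tag if the context matches."""
    from_tag: str
    to_tag: str
    condition: str  # "PREVTAG" or "NEXTTAG" (only two templates, for illustration)
    value: str      # the tag that must appear in the context position


def apply_rules(tags: List[str], rules: List[Rule]) -> List[str]:
    """Apply each rule in order over the whole tag sequence (second-stage sketch)."""
    tags = list(tags)  # work on a copy so the caller's list is not modified
    for rule in rules:
        for i, tag in enumerate(tags):
            if tag != rule.from_tag:
                continue
            if rule.condition == "PREVTAG":
                matched = i > 0 and tags[i - 1] == rule.value
            elif rule.condition == "NEXTTAG":
                matched = i + 1 < len(tags) and tags[i + 1] == rule.value
            else:
                matched = False  # unknown template: never fires
            if matched:
                tags[i] = rule.to_tag
    return tags


# Initial (first-stage) tags for "They want to race", with "race" tagged as a noun.
initial = ["PRP", "VBP", "TO", "NN"]
rules = [Rule("NN", "VB", "PREVTAG", "TO")]  # noun becomes verb after "to"
print(apply_rules(initial, rules))
```

Each rule is a small pattern over neighbouring tags, and on a CPU the rules are checked one after another at every position. That sequential scan is the kind of pattern-matching workload the abstract proposes to run in parallel on the AP.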
Similar resources
Uses for Random and Stochastic Input on Micron’s Automata Processor
Micron’s Automata Processor (AP) is a configurable memory-based device, purpose-built to emulate a theoretical nondeterministic finite automaton (NFA). While NFAs are not particularly suited for floating point computation, they are extremely powerful and efficient pattern matchers and have been shown to provide large speedups over traditional von Neumann execution for rule-based, data-mining appl...
Automata Designs for Data Encryption with AES using the Micron Automata Processor
Cybersecurity has become the most important issue in the current era of cyber warfare. Significant advantages can be obtained from using co-processing units when a cyberattack diminishes the computing power that the main processor cores have available for carrying out useful tasks. The Automata Processor (AP) is a novel accelerator from Micron Technology, which is based on the non-von Neumann architecture and the p...
Fast Cellular Automata Implementation on Graphic Processor Unit (GPU) for Salt and Pepper Noise Removal
Noise removal is commonly applied as a pre-processing step before subsequent image processing tasks, because noise occurs during acquisition or transmission. A common problem in imaging systems that use CMOS or CCD sensors is the appearance of salt and pepper noise. This paper presents a Cellular Automata (CA) framework for noise removal of distorted image by the salt an...
Part of Speech Tagging with Mixed Approaches of Neural Networks and Transformation Rules
For the purpose of constructing a practical part of speech tagger that uses as little training data as possible, an approach using neural networks, which uses different lengths of contexts based on longest context priority and takes into account the maximization of information amount, has been proposed so far. To further improve the tagging performance, this paper proposes an integrated approach o...
Part-of-Speech Tagging Using the Brill Method
Part-of-speech tagging is the process of associating each word in a text with its part-of-speech category and possibly a set of morphosyntactic features. This information is represented by part-of-speech tags. This paper describes an implementation of a part-of-speech tagger for Swedish based on the Brill method. The basic idea is to apply a set of rules to an initial annotation achieved using...
Journal title:
Volume, issue:
Pages: -
Publication date: 2014